Gaussian Process Models for HRTF based Sound-Source Localization and Active-Learning
Authors
Abstract
From a machine learning perspective, the human ability to localize sounds can be modeled as a non-parametric and non-linear regression problem between binaural spectral features of sound received at the ears (input) and their sound-source directions (output). The input features can be summarized in terms of the individual’s head-related transfer functions (HRTFs), which measure the spectral response between the listener’s eardrum and an external point in 3D. Based on these viewpoints, two related problems are considered: how can one achieve an optimal sampling of measurements for training sound-source localization (SSL) models, and how can SSL models be used to infer the subject’s HRTFs in listening tests. First, we develop a class of binaural SSL models based on Gaussian process regression and solve a forward selection problem that finds a subset of input-output samples that best generalize to all SSL directions. Second, we use an active-learning approach that updates an online SSL model for inferring the subject’s SSL errors via headphones and a graphical user interface. Experiments show that only a small fraction of HRTFs are required for 5° localization accuracy and that the learned HRTFs are localized closer to their intended directions than non-individualized HRTFs.
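The forward selection idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the RBF kernel, the hyperparameters `length_scale` and `noise`, the synthetic two-dimensional features standing in for binaural spectral features, the RMS-error selection criterion, and the helper names `rbf_kernel`, `gp_predict`, and `forward_select`.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    # Squared-exponential kernel between the rows of A and the rows of B.
    d2 = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-0.5 * np.maximum(d2, 0.0) / length_scale**2)

def gp_predict(X_train, y_train, X_test, length_scale=1.0, noise=1e-2):
    # Posterior mean of a zero-mean GP regressor with Gaussian noise.
    K = rbf_kernel(X_train, X_train, length_scale) + noise * np.eye(len(X_train))
    K_star = rbf_kernel(X_test, X_train, length_scale)
    return K_star @ np.linalg.solve(K, y_train)

def forward_select(X, y, n_select, **gp_args):
    # Greedy forward selection (illustrative criterion): at each step, add the
    # sample whose inclusion gives the lowest RMS error over ALL samples.
    selected, errors = [], []
    remaining = list(range(len(X)))
    for _ in range(n_select):
        best_idx, best_err = None, np.inf
        for i in remaining:
            trial = selected + [i]
            pred = gp_predict(X[trial], y[trial], X, **gp_args)
            err = np.sqrt(np.mean((pred - y) ** 2))
            if err < best_err:
                best_idx, best_err = i, err
        selected.append(best_idx)
        remaining.remove(best_idx)
        errors.append(best_err)
    return selected, errors

# Synthetic stand-in data (hypothetical): 37 azimuths in 5-degree steps, with
# a smooth 2-D feature vector per direction in place of real binaural features.
angles = np.linspace(-90.0, 90.0, 37)
X = np.column_stack([np.sin(np.radians(angles)), np.cos(np.radians(angles))])
y = angles

selected, errors = forward_select(X, y, n_select=5)
print("selected directions (deg):", angles[selected])
print("RMS error after each step (deg):", np.round(errors, 2))
```

The sketch refits the model from scratch for every candidate, which is simple but costs a linear solve per candidate; a practical implementation would more likely update the kernel factorization incrementally.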
Similar resources
N-dimensional N-microphone sound source localization
This paper investigates real-time N-dimensional wideband sound source localization in outdoor (far-field) and low-degree reverberation cases, using a simple N-microphone arrangement. Outdoor sound source localization in different climates needs highly sensitive and high-performance microphones, which are very expensive. Reduction of the microphone count is our goal. Time delay estimation (TDE)-b...
The bat head-related transfer function reveals binaural cues for sound localization in azimuth and elevation.
Directional properties of the sound transformation at the ear of four intact echolocating bats, Eptesicus fuscus, were investigated via measurements of the head-related transfer function (HRTF). Contributions of external ear structures to directional features of the transfer functions were examined by remeasuring the HRTF in the absence of the pinna and tragus. The investigation mainly focused ...
Monaural Sound Localization
The principles of human sound localization imply binaural (interaural level and time difference) as well as monaural cues. The latter are captured by the head-related transfer functions (HRTFs), which describe the direction-dependent, spectral shaping of the incident sound wave, and can be exploited to determine the direction. In this paper an accurate talker localization strategy in the horizo...
Applying scattering theory to robot audition system: robust sound source localization and extraction
Robot audition by its own ears (microphones) is essential for natural human-robot communication and interface. Since a microphone is embedded in the head of a robot, the head-related transfer function (HRTF) plays an important role in sound source localization and extraction. Usually, from binaural input, the interaural phase difference (IPD) and interaural intensity difference (IID) are calcul...
Epipolar geometry based sound localization and extraction for humanoid audition
Sound localization for a robot or an embedded system is usually solved by using Interaural Phase Difference (IPD) and Interaural Intensity Difference (IID). These values are calculated by using the Head-Related Transfer Function (HRTF). However, the HRTF depends on the shape of the head and also changes as the environment changes. Therefore, sound localization without HRTF is needed for real-world applications....
Journal title:
- CoRR
Volume abs/1502.03163, issue -
Pages -
Publication date 2015